Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move databricks provider to new structure #46207

Merged

Conversation

josix
Copy link
Contributor

@josix josix commented Jan 28, 2025

related: #46045

  • fix unit tests

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk potiuk force-pushed the refactor/move-databricks-provider-to-new-structure branch 2 times, most recently from ba47389 to 47e80ce Compare January 30, 2025 03:50
@potiuk
Copy link
Member

potiuk commented Jan 30, 2025

conflicts again (but I hope this time things will be green).

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch from 47e80ce to ddd7062 Compare January 30, 2025 09:21
@josix
Copy link
Contributor Author

josix commented Jan 30, 2025

Thanks! Let me fix this first, I guess the patch for move_providers.py might affect other providers.

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch from ddd7062 to a251efc Compare January 30, 2025 10:36
@potiuk
Copy link
Member

potiuk commented Jan 30, 2025

Thanks! Let me fix this first, I guess the patch for move_providers.py might affect other providers.

Cool :)

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch from a251efc to c938dcb Compare January 30, 2025 17:18
Copy link
Contributor

@o-nikolas o-nikolas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you just barely missed a fix in main when you rebased an hour ago 😢

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch from c938dcb to 6906290 Compare January 30, 2025 19:14
@josix
Copy link
Contributor Author

josix commented Jan 30, 2025

Hmmm... I rebased after #46291 merged, and it indeed fixed the issues that caused the previous failed tests in cncf.kubernetes and microsoft.azure related to time.sleep. However, it looks like this is a different issue, and all the failed tests are using caplog (not related 🤔) .

@potiuk potiuk force-pushed the refactor/move-databricks-provider-to-new-structure branch from 732343e to 8bc5b21 Compare January 31, 2025 13:50
@josix
Copy link
Contributor Author

josix commented Feb 1, 2025

I'll continue fixing the flaky caplog-related unit tests once the lazy consensus is reached.

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch 2 times, most recently from 5d71eca to 9f3d6b5 Compare February 4, 2025 15:12
@potiuk potiuk force-pushed the refactor/move-databricks-provider-to-new-structure branch from 9f3d6b5 to 3e78d05 Compare February 5, 2025 19:00
@potiuk
Copy link
Member

potiuk commented Feb 5, 2025

Rebased - and the consensus is reached so ..... you can fix it :)

@potiuk
Copy link
Member

potiuk commented Feb 6, 2025

Some errors to look at :( . Happy to help if needed.

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch from 3e78d05 to ccc1f6e Compare February 6, 2025 12:55
@josix
Copy link
Contributor Author

josix commented Feb 6, 2025

Yeah, let me circle back to this. Sorry for being busy the past two days, but thankfully, I just finished it and can now focus on this issue.

@potiuk
Copy link
Member

potiuk commented Feb 6, 2025

Yeah, let me circle back to this. Sorry for being busy the past two days, but thankfully, I just finished it and can now focus on this issue.

No worries at all :), It's open -source. We do stuff when we can !

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch from ccc1f6e to 6efc3e7 Compare February 6, 2025 18:35
@potiuk
Copy link
Member

potiuk commented Feb 6, 2025

The wasb tests should be "fine" after this is merged #46539 -> then you can rebase and 🤞 we should get this one green.

@potiuk
Copy link
Member

potiuk commented Feb 6, 2025

OK. you can rebase it now :)

@josix josix force-pushed the refactor/move-databricks-provider-to-new-structure branch from 6efc3e7 to 4a1dad1 Compare February 7, 2025 03:10
@josix
Copy link
Contributor Author

josix commented Feb 7, 2025

Thanks, @potiuk! It turns out that the issue is caused from caplog, just as my initially assumed. However, I still can't figure out why it is related to building connection, given that the error shows airflow.exceptions.AirflowNotFoundException: The conn_id `` isn't defined. How does caplog affect the Hooks 🤔

Disposing DB connection pool (PID 75)
Unable to retrieve connection from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
Traceback (most recent call last):
  File "/opt/airflow/airflow/models/connection.py", line 463, in get_connection_from_secrets
    conn = secrets_backend.get_connection(conn_id=conn_id)
  File "/opt/airflow/airflow/utils/session.py", line 100, in wrapper
    with create_session() as session:
  File "/usr/local/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/opt/airflow/airflow/utils/session.py", line 39, in create_session
    raise RuntimeError("Session must be set before!")
RuntimeError: Session must be set before!
Exception when trying to check remote location: "The conn_id `` isn't defined"
Unable to retrieve connection from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
Traceback (most recent call last):
  File "/opt/airflow/airflow/models/connection.py", line 463, in get_connection_from_secrets
    conn = secrets_backend.get_connection(conn_id=conn_id)
  File "/opt/airflow/airflow/utils/session.py", line 100, in wrapper
    with create_session() as session:
  File "/usr/local/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/opt/airflow/airflow/utils/session.py", line 39, in create_session
    raise RuntimeError("Session must be set before!")
RuntimeError: Session must be set before!
Could not write logs to wasb://container/remote/log/location/1.log
Traceback (most recent call last):
  File "/opt/airflow/providers/src/airflow/providers/microsoft/azure/log/wasb_task_handler.py", line 203, in wasb_write
    self.hook.load_string(log, self.wasb_container, remote_log_location, overwrite=True)
  File "/opt/airflow/providers/src/airflow/providers/microsoft/azure/hooks/wasb.py", line 361, in load_string
    self.upload(
  File "/opt/airflow/providers/src/airflow/providers/microsoft/azure/hooks/wasb.py", line 418, in upload
    blob_client = self._get_blob_client(container_name, blob_name)
  File "/opt/airflow/providers/src/airflow/providers/microsoft/azure/hooks/wasb.py", line 233, in _get_blob_client
    return self.blob_service_client.get_blob_client(container=container_name, blob=blob_name)
  File "/usr/local/lib/python3.9/functools.py", line 993, in __get__
    val = self.func(instance)
  File "/opt/airflow/providers/src/airflow/providers/microsoft/azure/hooks/wasb.py", line 157, in blob_service_client
    return self.get_conn()
  File "/opt/airflow/providers/src/airflow/providers/microsoft/azure/hooks/wasb.py", line 161, in get_conn
    conn = self.get_connection(self.conn_id)
  File "/opt/airflow/airflow/hooks/base.py", line 64, in get_connection
    conn = Connection.get_connection_from_secrets(conn_id)
  File "/opt/airflow/airflow/models/connection.py", line 474, in get_connection_from_secrets
    raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
airflow.exceptions.AirflowNotFoundException: The conn_id `` isn't defined

@potiuk
Copy link
Member

potiuk commented Feb 7, 2025

The conn_id `` isn't defined. How does caplog affect the Hooks 🤔

Beats me... No idea other than there are side effects from other tests. I know removing it and replacing with direct mocking helped in pretty much all cases where it happened (many) - I am not going to lose any more time on diving deeper :) - we all lost already far too much time on looking at caplog issues IMHO.

@potiuk potiuk force-pushed the refactor/move-databricks-provider-to-new-structure branch from aba0a22 to 93f2853 Compare February 7, 2025 15:42
@potiuk
Copy link
Member

potiuk commented Feb 7, 2025

Pushed a fixup with conflict resolution :)

@potiuk potiuk merged commit 1cde11a into apache:main Feb 7, 2025
90 checks passed
@potiuk
Copy link
Member

potiuk commented Feb 7, 2025

wooohooo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants